Cauca Department
Embedding-Aware Quantum-Classical SVMs for Scalable Quantum Machine Learning
Ordóñez, Sebastián Andrés Cajas, Torres, Luis Fernando Torres, Bifulco, Mario, Durán, Carlos Andrés, Bosch, Cristian, Carbajo, Ricardo Simón
Quantum Support Vector Machines face scalability challenges due to high-dimensional quantum states and hardware limitations. We propose an embedding-aware quantum-classical pipeline combining class-balanced k-means distillation with pretrained Vision Transformer embeddings. Our key finding: ViT embeddings uniquely enable quantum advantage, achieving up to 8.02% accuracy improvements over classical SVMs on Fashion-MNIST and 4.42% on MNIST, while CNN features show performance degradation. Using 16-qubit tensor network simulation via cuTensorNet, we provide the first systematic evidence that quantum kernel advantage depends critically on embedding choice, revealing fundamental synergy between transformer attention and quantum feature spaces. This provides a practical pathway for scalable quantum machine learning that leverages modern neural architectures.
- South America > Colombia > Cauca Department > Popayán (0.04)
- South America > Colombia > Antioquia Department > Medellín (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Information Technology (0.68)
- Health & Medicine > Therapeutic Area (0.46)
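The class-balanced k-means distillation step named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption: the function names `kmeans` and `distill_class_balanced`, the `per_class` parameter, and the use of plain Lloyd's k-means on precomputed embeddings. The idea shown is only that each class is clustered separately and contributes the same number of centroids, so the reduced training set stays balanced before it is passed to a (quantum or classical) kernel SVM.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float lists."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def distill_class_balanced(embeddings, labels, per_class):
    """Hypothetical helper: return `per_class` centroids for every class label."""
    prototypes, prototype_labels = [], []
    for cls in sorted(set(labels)):
        class_points = [e for e, y in zip(embeddings, labels) if y == cls]
        # A class with fewer points than `per_class` contributes fewer centroids.
        k = min(per_class, len(class_points))
        for centroid in kmeans(class_points, k):
            prototypes.append(centroid)
            prototype_labels.append(cls)
    return prototypes, prototype_labels
```

In a pipeline like the one described, `embeddings` would be the pretrained ViT feature vectors and the returned prototypes would form the small training set for the kernel SVM; that wiring is not shown here.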
Using LLMs to create analytical datasets: A case study of reconstructing the historical memory of Colombia
Anderson, David, Benitez, Galia, Bjarnadottir, Margret, Reyya, Shriyan
Colombia has been submerged in decades of armed conflict, yet until recently, the systematic documentation of violence was not a priority for the Colombian government. This has resulted in a lack of publicly available conflict information and, consequently, a lack of historical accounts. This study contributes to Colombia's historical memory by utilizing GPT, a large language model (LLM), to read and answer questions about over 200,000 violence-related newspaper articles in Spanish. We use the resulting dataset to conduct both descriptive analysis and a study of the relationship between violence and the eradication of coca crops, offering an example of policy analyses that such data can support. Our study demonstrates how LLMs have opened new research opportunities by enabling examinations of large text corpora at a previously infeasible depth.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- South America > Colombia > Bolivar Department (0.04)
- South America > Colombia > Southwest Colombia (0.04)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Government > Military (1.00)
- Government > Regional Government > South America Government > Colombia Government (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era
Restrepo, David, Wu, Chenwei, Vásquez-Venegas, Constanza, Nakayama, Luis Filipe, Celi, Leo Anthony, López, Diego M
In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability. We also propose "disentangled dense fusion", a novel embedding fusion method designed to optimize mutual information and facilitate dense inter-modality feature interaction, thereby minimizing redundant information. We demonstrate the model's efficacy through three use cases: predicting diabetic retinopathy using retinal images and patient metadata, domestic violence prediction employing satellite imagery, internet, and census data, and identifying clinical and demographic features from radiography images and clinical notes. The model achieved a Macro F1 score of 0.92 in diabetic retinopathy prediction, an R-squared of 0.854 and sMAPE of 24.868 in domestic violence prediction, and a macro AUC of 0.92 and 0.99 for disease prediction and sex classification, respectively, in radiological analysis. These results underscore the Data Fusion for Data Mining model's potential to significantly impact multimodal data processing, promoting its adoption in diverse, resource-constrained settings.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Overview (1.00)
- Research Report > Experimental Study (0.94)
- Research Report > Promising Solution (0.67)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.69)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Data Science > Data Integration (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
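The DF-DM abstract above reports a macro F1 score, an R-squared, and an sMAPE. As a reading aid, the sketch below shows one common way to compute each metric in plain Python. These are textbook definitions, not the authors' evaluation code; in particular, sMAPE has several variants in the literature, and the one assumed here is the mean of 2|y - ŷ| / (|y| + |ŷ|), expressed in percent.

```python
def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant: 2|y - yhat| / (|y| + |yhat|))."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        # Treat a 0/0 pair as a perfect prediction.
        terms.append(0.0 if denom == 0 else 2.0 * abs(t - p) / denom)
    return 100.0 * sum(terms) / len(terms)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(0.0 if denom == 0 else 2 * tp / denom)
    return sum(scores) / len(scores)
```

Because macro F1 averages over classes without weighting by class size, it is sensitive to performance on rare classes, which is why it is often preferred for imbalanced medical datasets.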
Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications
Restrepo, David, Wu, Chenwei, Cajas, Sebastián Andrés, Nakayama, Luis Filipe, Celi, Leo Anthony, López, Diego M
Large-scale multi-modal deep learning models have revolutionized domains such as healthcare, highlighting the importance of computational power. However, in resource-constrained regions like Low- and Middle-Income Countries (LMICs), limited access to GPUs and data poses significant challenges, often leaving CPUs as the sole resource. To address this, we advocate for leveraging vector embeddings to enable flexible and efficient computational methodologies, democratizing multimodal deep learning across diverse contexts. Our paper investigates the efficiency and effectiveness of using vector embeddings from single-modal foundation models and multi-modal Vision-Language Models (VLMs) for multimodal deep learning in low-resource environments, particularly in healthcare. Additionally, we propose a simple yet effective inference-time method to enhance performance by aligning image-text embeddings. Comparing these approaches with traditional methods, we assess their impact on computational efficiency and model performance using metrics like accuracy, F1-score, inference time, training time, and memory usage across three medical modalities: BRSET (ophthalmology), HAM10000 (dermatology), and SatelliteBench (public health). Our findings show that embeddings reduce computational demands without compromising model performance. Furthermore, our alignment method improves performance in medical tasks. This research promotes sustainable AI practices by optimizing resources in constrained environments, highlighting the potential of embedding-based approaches for efficient multimodal learning. Vector embeddings democratize multimodal deep learning in LMICs, particularly in healthcare, enhancing AI adaptability in varied use cases.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
- Health & Medicine > Therapeutic Area > Dermatology (0.66)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.48)
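The abstract above mentions an inference-time method that aligns image and text embeddings but does not describe it. The sketch below is therefore not the authors' method; it illustrates one simple, training-free way such an alignment could work, under the assumption that the two modalities occupy offset regions of a shared embedding space. Each modality is shifted toward the other along the vector between their centroids. The names `centroid`, `close_gap`, and the strength parameter `lam` are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length float lists."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def close_gap(image_emb, text_emb, lam=1.0):
    """Shift both modalities toward each other along the centroid gap.

    lam=0 leaves the embeddings unchanged; lam=1 makes the two centroids
    coincide. This is an illustrative assumption, not a published recipe.
    """
    gap = [i - t for i, t in zip(centroid(image_emb), centroid(text_emb))]
    shifted_images = [
        [x - 0.5 * lam * g for x, g in zip(v, gap)] for v in image_emb
    ]
    shifted_texts = [
        [x + 0.5 * lam * g for x, g in zip(v, gap)] for v in text_emb
    ]
    return shifted_images, shifted_texts
```

Because the shift is a single vector addition per embedding, it needs no gradient computation or GPU, which matches the CPU-only setting the abstract targets.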